Info

Note that an API key from CensusMapper (censusmapper.ca) is also required, and you need to set the cache location as well. Both values are masked in the code below, so you will need to supply your own.

require("cancensus")
## Loading required package: cancensus
## Census data is currently stored temporarily.
## 
##  In order to speed up performance, reduce API quota usage, and reduce unnecessary network calls, please set up a persistent cache directory by setting options(cancensus.cache_path = '<path to cancensus cache directory>')
## 
##  You may add this option, together with your API key, to your .Rprofile.
require("tidyverse")
## Loading required package: tidyverse
## ── Attaching packages ────────────────────────────────────────────────── tidyverse 1.2.1 ──
## ✔ ggplot2 2.2.1     ✔ purrr   0.2.4
## ✔ tibble  1.4.1     ✔ dplyr   0.7.4
## ✔ tidyr   0.7.2     ✔ stringr 1.2.0
## ✔ readr   1.1.1     ✔ forcats 0.2.0
## ── Conflicts ───────────────────────────────────────────────────── tidyverse_conflicts() ──
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag()    masks stats::lag()
require("sf")
## Loading required package: sf
## Linking to GEOS 3.6.1, GDAL 2.1.3, proj.4 4.9.3
require("sp")
## Loading required package: sp
require("rgdal")
## Loading required package: rgdal
## rgdal: version: 1.2-16, (SVN revision 701)
##  Geospatial Data Abstraction Library extensions to R successfully loaded
##  Loaded GDAL runtime: GDAL 2.1.3, released 2017/20/01
##  Path to GDAL shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/gdal
##  GDAL binary built with GEOS: FALSE 
##  Loaded PROJ.4 runtime: Rel. 4.9.3, 15 August 2016, [PJ_VERSION: 493]
##  Path to PROJ.4 shared files: /Library/Frameworks/R.framework/Versions/3.4/Resources/library/rgdal/proj
##  Linking to sp version: 1.2-5
require("leaflet")
## Loading required package: leaflet
#Set API key - this will store it for future use
#DO NOT MAKE YOUR KEY PUBLIC

census_api_key = "MASKED"
census_cache_path = 'MASKED'


options(cancensus.api_key = census_api_key)
options(cancensus.cache_path = census_cache_path)
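
One way to keep the key out of a shared script is to read it from an environment variable. This is a minimal sketch, assuming the hypothetical variable names `CENSUSMAPPER_KEY` and `CENSUSMAPPER_CACHE`; these names and the `read_setting()` helper are illustrative only and are not defined by cancensus.

```r
# Hypothetical helper: read a setting from an environment variable,
# falling back to a default when the variable is unset or empty.
read_setting <- function(var, default = NA_character_) {
  value <- Sys.getenv(var, unset = "")
  if (nzchar(value)) value else default
}

# The variable names below are assumptions made for this sketch.
census_api_key    <- read_setting("CENSUSMAPPER_KEY")
census_cache_path <- read_setting("CENSUSMAPPER_CACHE", default = tempdir())

if (is.na(census_api_key)) {
  message("No API key found; set CENSUSMAPPER_KEY before querying.")
} else {
  options(cancensus.api_key = census_api_key)
}
options(cancensus.cache_path = census_cache_path)
```

With this approach the script can be shared as-is, since the key itself never appears in it.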

Background

Census Data


cancensus Package


To see the census periods available:

list_census_datasets()
## Querying CensusMapper API for available datasets...

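Census data is collected for each individual. For privacy reasons, the data is aggregated to spatial areas, and the smaller areas roll up into larger ones. The levels used in this post are the province (PR), the census metropolitan area (CMA), the census subdivision (CSD), and the dissemination area (DA).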

To see the spatial areas available and their corresponding codes:

list_census_regions("CA16")
## Querying CensusMapper API for regions data...
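
To make the roll-up idea concrete, here is a minimal base R sketch. The area codes and populations are invented for illustration and do not come from the census.

```r
# Toy data: each row is a small area (DA) that belongs to a larger area (CSD).
# All codes and counts are made up for this sketch.
da <- data.frame(
  da_id      = c("DA1", "DA2", "DA3", "DA4"),
  csd_id     = c("CSD_A", "CSD_A", "CSD_B", "CSD_B"),
  population = c(500, 650, 420, 380)
)

# Roll the DA counts up to the CSD level by summing within each CSD.
csd <- aggregate(population ~ csd_id, data = da, FUN = sum)
print(csd)
```

Counts roll up by simple addition. Medians, such as median household income, do not, so a median has to be requested at the level you want rather than computed from smaller areas.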

To find the region code for Edmonton, it’s easiest to search for it by name:

list_census_regions("CA16") %>% filter(name == 'Edmonton')
## Querying CensusMapper API for regions data...
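
An exact match on `name` can return more than one row, because the same name may exist at several levels (Edmonton is both a CMA and a CSD). The sketch below uses a toy table, assuming the result has `region`, `name`, and `level` columns. The rows are illustrative stand-ins rather than real API output; only the CMA code 48835 is taken from this post, and the CSD codes are placeholders.

```r
# Toy stand-in for a regions table; the CSD codes are placeholders.
regions <- data.frame(
  region = c("48835", "CSD_0001", "CSD_0002"),
  name   = c("Edmonton", "Edmonton", "Leduc"),
  level  = c("CMA", "CSD", "CSD"),
  stringsAsFactors = FALSE
)

# Exact match on the name, then narrow to a single level.
exact <- subset(regions, name == "Edmonton")
cma   <- subset(exact, level == "CMA")

# Case-insensitive partial match, useful when the exact spelling is unknown.
partial <- regions[grepl("edmon", regions$name, ignore.case = TRUE), ]

print(cma$region)
```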

Objective: Examine the median income for households in Alberta

Use the CSD level to limit the data (DA is too small).
Use the 2016 census (dataset CA16).
Examine the distribution.
See the spatial distribution.

Obtain the data first:

AB <- get_census(dataset='CA16', 
                 regions=list(PR="48"),
                 vectors="v_CA16_2397", 
                 level='CSD', 
                 quiet = TRUE, 
                 geo_format = 'sf', 
                 labels = 'short')

print(AB[, c(5, 7, 14)])
## Simple feature collection with 425 features and 3 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -120.0027 ymin: 48.99674 xmax: -109.9996 ymax: 60.00006
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## # A tibble: 425 x 4
##    GeoUID  name                   v_CA16_2397                     geometry
##    <chr>   <chr>                        <dbl>       <sf_geometry [degree]>
##  1 4805002 Carmangay (VL)               44256 MULTIPOLYGON (((-113.1041 5…
##  2 4817855 Desmarais (S-É)             108288 MULTIPOLYGON (((-113.7885 5…
##  3 4812815 Cold Lake 149B (IRI)         34688 MULTIPOLYGON (((-110.2011 5…
##  4 4813011 Sunset Point (SV)            96000 MULTIPOLYGON (((-114.3449 5…
##  5 4808028 Gull Lake (SV)               85589 MULTIPOLYGON (((-113.9273 5…
##  6 4808024 Eckville (T)                 74837 MULTIPOLYGON (((-114.3673 5…
##  7 4808025 Half Moon Bay (SV)              NA MULTIPOLYGON (((-114.1697 5…
##  8 4814003 Yellowhead County (MD)       92544 MULTIPOLYGON (((-116.699 54…
##  9 4813007 Yellowstone (SV)             91392 MULTIPOLYGON (((-114.3856 5…
## 10 4813013 Birch Cove (SV)                 NA MULTIPOLYGON (((-114.3743 5…
## # ... with 415 more rows

Check the summary statistics:

summary(AB$v_CA16_2397)
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   20928   61664   76352   76631   89613  195570      53
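
The `NA's` entry counts the regions whose value is missing, and most base R statistics need `na.rm = TRUE` to skip them. Here is a small sketch using the ten values from the printed rows above:

```r
# The ten median incomes shown in the printed rows above (two are missing).
income <- c(44256, 108288, 34688, 96000, 85589, 74837, NA, 92544, 91392, NA)

n_missing <- sum(is.na(income))            # number of missing values
med_naive <- median(income)                # NA, because missing values propagate
med_clean <- median(income, na.rm = TRUE)  # median of the 8 observed values

print(c(missing = n_missing, median = med_clean))
```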

Create a histogram of median household income:

# Refer to the column by name inside aes() rather than via AB$...
ggplot(data = AB, aes(x = v_CA16_2397)) + 
  geom_histogram(breaks = seq(0, 250000, by = 10000), 
                 col = "black", 
                 fill = "blue", 
                 alpha = .9) + 
  labs(title = "Histogram for Median Income",
       x = "Median Income", y = "Count")
## Warning: Removed 53 rows containing non-finite values (stat_bin).
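
The `breaks` argument defines the bin edges. As a rough check of what the histogram counts, the same binning can be reproduced in base R with `cut()`. This sketch reuses the ten values printed earlier rather than the full data set.

```r
income <- c(44256, 108288, 34688, 96000, 85589, 74837, NA, 92544, 91392, NA)

# Same edges as the histogram: 0, 10000, ..., 250000 (25 bins).
breaks <- seq(0, 250000, by = 10000)

# Assign each value to a bin; NA values stay NA and are dropped by table().
bin    <- cut(income, breaks = breaks, dig.lab = 6)
counts <- table(bin)

# Show only the bins that contain at least one value.
print(counts[counts > 0])
```

The missing values are left out of the counts, which mirrors the warning about removed rows.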

Basic Spatial Plot, automatic

plot(AB["v_CA16_2397"], main = "AB Median Household Income by CSD")

Interactive Plot via leaflet

#labels for pop-ups
income_labels <- sprintf(
  "<strong>%s</strong><br/>%g Median Household Income",
  AB$name, AB$v_CA16_2397
) %>% lapply(htmltools::HTML)

#bins for colours/legends
bins <- c(0, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 90000, 100000, 125000, 150000, 200000, Inf)
#colour palette selection
pal <- colorBin("RdYlBu", domain = AB$v_CA16_2397, bins = bins)

#create graph
leaflet(AB) %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(fillColor = ~pal(v_CA16_2397),
              color = "white",
              weight = 1,
              opacity = 1,
              fillOpacity = 0.65,
              label = income_labels,
              highlightOptions = highlightOptions(
                weight = 5,
                color = "#666",
                dashArray = "",
                fillOpacity = 0.7,
                bringToFront = TRUE)) %>%
  addLegend("bottomright", pal = pal , values = ~v_CA16_2397, opacity = 1, title="Median Household Income by CSD")
## Warning in RColorBrewer::brewer.pal(max(3, n), palette): n too large, allowed maximum for palette RdYlBu is 11
## Returning the palette you asked for with that many colors

## Warning in RColorBrewer::brewer.pal(max(3, n), palette): n too large, allowed maximum for palette RdYlBu is 11
## Returning the palette you asked for with that many colors
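
The warning appears to come from asking for more colour classes than the `RdYlBu` palette defines. The `bins` vector above produces 13 intervals, while the warning states a maximum of 11. Below is a base R sketch of that count, along with one possible coarser set of edges; the merged edges are an illustrative choice, not the author's.

```r
bins <- c(0, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 90000,
          100000, 125000, 150000, 200000, Inf)

# n edges define n - 1 intervals.
n_classes <- length(bins) - 1

# Hypothetical coarser edges that stay within an 11-colour palette.
bins_coarse <- c(0, 20000, 40000, 50000, 60000, 70000, 90000,
                 100000, 125000, 150000, 200000, Inf)
n_coarse <- length(bins_coarse) - 1

# findInterval() shows which class a given value falls into.
class_of_median <- findInterval(76352, bins_coarse)

print(c(original = n_classes, coarse = n_coarse))
```

Passing a vector like `bins_coarse` to `colorBin()` should keep the request within the palette's range, at the cost of less detail at the low end.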

Unfortunately, Edmonton is a single CSD, so this map gives no indication of the distribution within Edmonton.

Repeat for Edmonton only, at the DA level

edmonton <- get_census(dataset='CA16', 
                       regions=list(CMA="48835"), 
                       vectors="v_CA16_2397", 
                       level='DA', 
                       quiet = TRUE, 
                       geo_format = 'sf', 
                       labels = 'short')


income_labels <- sprintf(
  "<strong>%s</strong><br/>%g Median Income",
  edmonton$GeoUID, edmonton$v_CA16_2397
) %>% lapply(htmltools::HTML)

#names(edmonton)


bins <- c(0, 10000, 20000, 30000, 40000, 50000, 60000, 70000, 90000, 100000, 125000, 150000, 200000, Inf)
pal <- colorBin("RdYlBu", domain = edmonton$v_CA16_2397, bins = bins)
leaflet(edmonton) %>% 
  addProviderTiles(providers$CartoDB.Positron) %>%
  addPolygons(fillColor = ~pal(v_CA16_2397),
              color = "white",
              weight = 1,
              opacity = 1,
              fillOpacity = 0.65,
              label = income_labels,
              highlightOptions = highlightOptions(
                weight = 5,
                color = "#666",
                dashArray = "",
                fillOpacity = 0.7,
                bringToFront = TRUE)) %>%
  addLegend("bottomright", pal = pal, values = ~v_CA16_2397, opacity = 1, title="Median Household Income by DA")
## Warning in RColorBrewer::brewer.pal(max(3, n), palette): n too large, allowed maximum for palette RdYlBu is 11
## Returning the palette you asked for with that many colors

## Warning in RColorBrewer::brewer.pal(max(3, n), palette): n too large, allowed maximum for palette RdYlBu is 11
## Returning the palette you asked for with that many colors

Thanks for reading.

Credit to Jens von Bergmann (@vb_jens) at CensusMapper, who developed and supports this package.